May 16, 2016

Books

Getting and Using R

Why R?

  • Because ISLR!
  • I believe it's the best environment for learning the concepts of machine learning
  • R + RStudio are free and open source and easy to use
  • Concepts can be directly applied to other environments, particularly Python
  • We're all programmers and my experience tells me you won't have a problem with learning R!

Modern R

  • Hadley Wickham, "The Hadleyverse"
    • ggplot2
    • dplyr
    • tidyr

Learning R

Goals of the Data Science Meetup

  • Have fun!
    • Forum for taking some risk
  • Learn about the topic of machine learning
  • ISLR is not the end

Structure of the Data Science Meetup

  • Structured sessions
    • Interaction encouraged
  • Presentations available on BMC GitHub
  • Regular sessions, which will be recorded
  • Labs
    • Volunteer to present?
  • Open ended

Introduction to Statistical Learning

Exploring Data

  • Quantitative/continuous values
    • Age
    • Wage
  • Qualitative/categorical values
    • Education level
    • Stock movement (up, down)
    • Low, medium, high (ordered categorical)
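
The distinction above can be sketched in a few lines of code. This is only an illustration, written in Python since the concepts carry over; the records, field names, and the `RISK_LEVELS` list are made-up assumptions, not data from ISLR.

```python
from collections import Counter
from statistics import mean

# Hypothetical records; every value here is invented for illustration.
records = [
    {"age": 25, "wage": 48.0, "education": "HS Grad", "risk": "low"},
    {"age": 41, "wage": 95.5, "education": "College Grad", "risk": "high"},
    {"age": 33, "wage": 70.2, "education": "College Grad", "risk": "medium"},
]

# Quantitative values: arithmetic summaries such as the mean are meaningful.
mean_age = mean(r["age"] for r in records)

# Qualitative values: arithmetic is not meaningful, so we count categories.
education_counts = Counter(r["education"] for r in records)

# Ordered categorical values: the order comes from an explicit list of
# levels, not from alphabetical order of the labels.
RISK_LEVELS = ["low", "medium", "high"]
sorted_risk = sorted((r["risk"] for r in records), key=RISK_LEVELS.index)

print(mean_age)
print(education_counts)
print(sorted_risk)
```

In R, the same roles are played by numeric vectors, factors, and ordered factors.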

History of Statistical Learning

  • Method of Least Squares
    • Legendre and Gauss, early 19th century
    • Basis of linear regression
    • Predict quantitative values
  • Linear Discriminant Analysis
    • Fisher, 1936
    • Predict qualitative values
  • Logistic Regression
    • Various authors, 1940s
    • Alternative to LDA
  • Generalized Linear Models
    • Nelder and Wedderburn, early 1970s
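
As a rough illustration of the first item, the method of least squares for a single predictor chooses the intercept and slope that minimize the sum of squared residuals. The sketch below uses the standard closed-form solution; the function name `fit_line` and the toy data are assumptions made for this example, and Python is used only because the concepts carry over.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Slope = sum of cross-deviations over sum of squared x-deviations.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point of means.
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Toy data lying exactly on y = 1 + 2x, so the fit should recover it.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
intercept, slope = fit_line(xs, ys)
print(intercept, slope)
```

With real, noisy data the points will not lie on a line, and the fitted coefficients are the best compromise in the squared-error sense.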

ISLR Premises

  • Many statistical learning methods are relevant and useful in a wide range of academic and non-academic disciplines, beyond just the statistical sciences
  • Statistical learning should not be viewed as a series of black boxes
  • While it is important to know what job is performed by each cog, it is not necessary to have the skills to construct the machine inside the box!
  • We presume that the reader is interested in applying statistical learning methods to real-world problems